G3p

Overview
Gene 3 protein (g3p or pIII) is a minor coat protein found on the surface of filamentous bacteriophage. The protein consists of 406 amino acids divided into three domains separated by glycine-rich linkers. Peptides or proteins can be fused to g3p and evaluated for binding or other properties.

Structural Analysis
Five major papers are discussed below, outlining how the structural analysis of g3p evolved.

Initial Observations
The first structure of g3p entered into the PDB was determined by Holliger and Riechmann in late 1996 to early 1997. They used NMR spectroscopy to solve the structure of the first domain of g3p. Shown is a combination of the 15 most energetically favorable states. Observations on secondary structure are given below.

In 1997, Lubkowski et al. crystallized the first two domains of g3p from M13 phage to 1.46 Å resolution. Differences from the first published structure were mainly attributed to turns in the structure, the orientation of the C terminus, and the amino acids interacting between the two domains. In addition, the authors made two overall observations: (1) a cis proline near the C-terminal end and (2) an oxidized tryptophan. The cis proline later turned out to be important for the function of the protein. The oxidized tryptophan, however, was not found in other structures.

One year later, another structure containing D1 and D2 (albeit from the filamentous phage fd, which differs by two residues) was published, solved by X-ray crystallography to 1.9 Å resolution. Dimer formation was observed, but was attributed to the experimental conditions.

Finally, two later papers described interactions between g3p and the TolA protein, which is located on the host cell.

D1 Domain
The D1 domain consists mostly of beta sheets. Both Holliger and Riechmann and Lubkowski et al. noted an N-terminal alpha helix in their respective publications. Apart from this helix, five beta strands are arranged in a barrel-like motif, which combines with two strands from the second domain to form an antiparallel sheet. Disulfide bonds exist between Cys 7 and Cys 36 (left-handed helix) and between Cys 46 and Cys 53 (right-handed hook).

D2 Domain
This domain contains eight beta strands: six in a mixed beta sheet and two interacting with the D1 antiparallel sheet (β6 and β13). The amino acids between β6 and β7 do not form a specific motif but make stabilizing hydrophobic interactions with other parts of the domain. Three hairpins exist in this domain: between β8 and β9, β9 and β10, and β10 and β11 (with a cis proline in the last hairpin). The final secondary structural element is an alpha helix that interacts with the rest of the domain via hydrophobic interactions. Of note, there is a cation-π interaction between His 191 and Phe 199. The C terminus of D2 has seven residues, three of which are proline, one of them in the cis conformation.

Lubkowski et al. liken the arrangement of D1 and D2 to a horseshoe-shaped molecule and attribute its stability to hydrophobic residues facing toward the center.

D3 Domain
Upon review of the literature, no structural analyses of this domain were identified.

Linkers
The linkers have no known specific function but appear to give the protein the flexibility needed for optimal infectivity. In both Lubkowski et al. and Holliger et al., these regions were not observed in the crystal structures.

Infectivity
The function of the protein correlates closely with its structural domains. The D1 domain (blue) interacts with the TolA protein (orange) in the periplasm of the bacterial cell. (The C-terminal domain of TolA is the coreceptor for filamentous phage infection of E. coli; Cell 90, 351-360 (1997).) The D2 domain binds to the F pilus on the outer membrane of Escherichia coli; however, it also blocks TolA binding to D1 in the absence of the F pilus. In fact, without the D2 domain, infectivity is very low. It has been speculated that D2 interacts with the F pilus first, drawing the phage closer to the bacterial cell and thus allowing D1-TolA interactions to occur. Chatellier et al. suggest that the complex formed by D1 and D2 may protect the protein from degradation by bacterial proteases, and that upon binding the protein opens up so that D3 can reach the inner membrane of the bacterium.

The function of D3 was elucidated last. The D3 domain “anchors” to the F pilus and is necessary for phage packaging.

Phage Display
Infection with filamentous phage does not cause host cell lysis or death. The N terminus of g3p can be truncated and the peptide of choice can be inserted. Insertions can also be made between D2 and D3. Specifically, fusions to the N terminus have no effect on infectivity, peptide insertions between D2 and D3 cause a 100-fold reduction, and noncovalently interacting peptides cause a 1,000- to 100,000-fold reduction. Peptides can be fused to the CT domain or to the N1 domain; neither site is near the center of the horseshoe.
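To make the fold reductions quoted above easier to compare, the short Python sketch below converts them into the percentage of wild-type infectivity that remains. It is only an illustration of the arithmetic: the dictionary name, the function names, and the choice to treat "no effect" as a 1-fold reduction are assumptions made for this example, not part of the cited work.

```python
from fractions import Fraction

# Fold reductions in infectivity as quoted in the text.
# Each entry is (lowest fold reduction, highest fold reduction).
# "No effect" is represented here as a 1-fold reduction (an assumption).
FOLD_REDUCTION = {
    "N-terminal fusion": (1, 1),
    "peptide insertion between D2 and D3": (100, 100),
    "noncovalently interacting peptides": (1_000, 100_000),
}


def relative_infectivity(fold):
    """Fraction of wild-type infectivity left after a given fold reduction."""
    return Fraction(1, fold)


def summarize(table):
    """Return {fusion type: (best case, worst case)} in percent of wild type."""
    summary = {}
    for name, (low, high) in table.items():
        best = float(relative_infectivity(low) * 100)
        worst = float(relative_infectivity(high) * 100)
        summary[name] = (best, worst)
    return summary


if __name__ == "__main__":
    for name, (best, worst) in summarize(FOLD_REDUCTION).items():
        print(f"{name}: {worst:g}% to {best:g}% of wild-type infectivity")
```

Exact fractions are used before the final conversion to percent so that the reciprocal of each fold reduction is not affected by floating-point rounding.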

While the N-terminal domain is necessary for infection, not all five copies of g3p on the phage surface need to be the full-length protein.

Evolutionarily Related Proteins
N1 and N2 share 15% identity but have a “nearly identical fold”, which raises the question of whether they share a common origin or arose by gene duplication.

Using DALI to search for similar proteins containing both domains, Lubkowski et al. were unable to identify any evolutionarily related proteins. When searching with the individual domains, they found some similar proteins.

D1
They reported a similarity to hemopexin (1hxn) with a Z score of 1.1 (p>0.05). A comparison with a permuted SH3 domain (1tuc) was also made, but no sequence homology was found.

D2
Lubkowski et al. identified the PDZ domain of human discs large protein (1pdr) as a potentially related protein (Z score = 2.1). This protein is smaller than g3p, and consequently two beta strands in the core of the domain share no identity with D2 (Holliger and Riechmann, 9032075).

3D structures of G3p
2x9b – IF1G3p TolA-binding domain – Enterobacteria phage IF1

3knq – FDG3p residues 19-238 – Enterobacteria phage FD

3dgs – FDG3p residues 0-226 (mutant)

2g3p – FDG3p N-terminal domain

2x9a – IF1G3p TolA-binding domain + TolA C-terminal domain – Enterobacteria phage IF1

1tol – M13G3p N-terminal domain/ M13TolA C-terminal domain – Enterobacteria phage M13

1g3p – M13G3p N-terminal domain

1fgp – FDG3p membrane penetration domain – NMR
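For readers who want to work with these entries programmatically, the Python sketch below stores the list above as a dictionary and builds a coordinate-file download address for each identifier. The dictionary and function names are hypothetical, chosen for this example only. The address pattern is the one used by the RCSB PDB file server for PDB-format files; nothing is downloaded by this code, and availability of each file should be checked before relying on it.

```python
# PDB entries from the list above, keyed by identifier.
# Descriptions are copied from the list; the names used here are illustrative.
G3P_STRUCTURES = {
    "2x9b": "IF1G3p TolA-binding domain",
    "3knq": "FDG3p residues 19-238",
    "3dgs": "FDG3p residues 0-226 (mutant)",
    "2g3p": "FDG3p N-terminal domain",
    "2x9a": "IF1G3p TolA-binding domain + TolA C-terminal domain",
    "1tol": "M13G3p N-terminal domain / M13TolA C-terminal domain",
    "1g3p": "M13G3p N-terminal domain",
    "1fgp": "FDG3p membrane penetration domain (NMR)",
}


def rcsb_download_url(pdb_id):
    """Build the RCSB file-server address for a PDB-format coordinate file."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def entries_matching(keyword):
    """Return the sorted identifiers whose description mentions the keyword."""
    keyword = keyword.lower()
    return sorted(
        pdb_id
        for pdb_id, description in G3P_STRUCTURES.items()
        if keyword in description.lower()
    )


if __name__ == "__main__":
    # List the entries that include TolA, with their download addresses.
    for pdb_id in entries_matching("TolA"):
        print(pdb_id, G3P_STRUCTURES[pdb_id], rcsb_download_url(pdb_id))
```

Filtering by a keyword such as "TolA" is a simple way to separate the g3p-TolA complexes and TolA-binding domains from the remaining entries.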